[pull] main from openai:main by pull[bot] · Pull Request #58 · kontext-security/codex

pull · 2026-03-12T00:25:27Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


`exec()` had a number of arguments that were unused, making the function signature misleading. This PR aims to clean things up to clarify the role of this function and to clarify which fields of `ExecParams` are unused and why.

## Summary - Allow app-server websocket capability auth to accept a precomputed SHA-256 digest via `--ws-token-sha256`. - Keep token-file support and enforce exactly one capability token source. - Document the new auth flag. ## Testing - `just fmt` - `cargo test -p codex-app-server transport::auth::tests` - `cargo test -p codex-app-server websocket_capability_token_sha256_args_parse` - `cargo test -p codex-cli app_server_capability_token_flags_parse` - `cargo clippy -p codex-app-server --all-targets -- -D warnings` - `just fix -p codex-cli` --------- Co-authored-by: Codex <noreply@openai.com>
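The "exactly one capability token source" rule can be sketched as below. This is a minimal illustration, not the app-server's actual code: `TokenSource`, `parse_sha256_hex`, and `resolve_token_source` are hypothetical names, and the sketch assumes capability auth has already been requested, so supplying neither source is also treated as an error.

```rust
/// Hypothetical model of the two capability-token sources named in the summary.
#[derive(Debug, PartialEq)]
enum TokenSource {
    /// Path to a file holding the raw token.
    File(String),
    /// Precomputed SHA-256 digest of the token (32 bytes).
    Sha256Digest([u8; 32]),
}

/// Decodes a 64-character hex string into a 32-byte digest.
fn parse_sha256_hex(hex: &str) -> Result<[u8; 32], String> {
    if hex.len() != 64 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("expected exactly 64 hex characters".to_string());
    }
    let mut out = [0u8; 32];
    for (i, byte) in out.iter_mut().enumerate() {
        // Safe to slice by byte offsets: the string was checked to be ASCII hex.
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16)
            .map_err(|e| format!("invalid hex at byte {i}: {e}"))?;
    }
    Ok(out)
}

/// Accepts exactly one source; both or neither is rejected.
fn resolve_token_source(
    token_file: Option<&str>,
    token_sha256: Option<&str>,
) -> Result<TokenSource, String> {
    match (token_file, token_sha256) {
        (Some(path), None) => Ok(TokenSource::File(path.to_string())),
        (None, Some(hex)) => parse_sha256_hex(hex).map(TokenSource::Sha256Digest),
        (Some(_), Some(_)) => Err("specify only one capability token source".to_string()),
        (None, None) => Err("a capability token source is required".to_string()),
    }
}
```

Validating the digest at parse time means a malformed `--ws-token-sha256` value fails at startup rather than at the first connection attempt.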

## Changes Allows MCPs to opt in to receiving sandbox config info through `_meta` on model-initiated tool calls. This lets MCPs adhere to the thread's sandbox if they choose to. ## Details - Adds the `codex/sandbox-state-meta` experimental MCP capability. - Tracks whether each MCP server advertises that capability. - When a server opts in, `codex-core` injects the current `SandboxState` into model-initiated MCP tool-call request `_meta`. ## Verification - added an integration test for the capability

## Changes Allows sandboxes to restrict overall network access while granting access to specific unix sockets on mac. ## Details - `codex sandbox macos`: adds a repeatable `--allow-unix-socket` option. - `codex-sandboxing`: threads explicit Unix socket roots into the macOS Seatbelt profile generation. - Preserves restricted network behavior when only Unix socket IPC is requested, and preserves full network behavior when full network is already enabled. ## Verification - `cargo test -p codex-cli -p codex-sandboxing` - `cargo build -p codex-cli --bin codex` - verified that `codex sandbox macos --allow-unix-socket /tmp/test.sock -- test-client` grants access as expected

## Summary - Fix marketplace-add local path detection on Windows by using `Path::is_absolute()`. - Make marketplace-add local-source tests parse/write TOML through the same helpers instead of raw string matching. - Update `rand` 0.9.x to 0.9.3 and document the remaining audited `rand` 0.8.5 advisory exception. - Refresh `MODULE.bazel.lock` after the Cargo.lock update. ## Why Latest `main` had two independent CI blockers: marketplace-add tests were not portable to Windows path/TOML escaping, and cargo-deny still reported `RUSTSEC-2026-0097` after the recent rustls-webpki fix. ## Validation - `cargo test -p codex-core marketplace_add -- --nocapture` - `cargo deny --all-features check` - `just bazel-lock-check` - `just fix -p codex-core` - `just fmt` - `git diff --check`

Add a menu that: 1. Proposes enabling the memories feature if it is not enabled. 2. Lets you choose whether to generate memories and whether to use them.

Stacked on #17402.

MCP tools returned by `tool_search` (deferred tools) get registered in our `ToolRegistry` with a different format than directly available tools. This leads to two different ways of accessing MCP tools from our tool catalog, only one of which works for each tool. Fix this by registering all MCP tools with the namespace format, since this info is already available.

Also, direct MCP tools are registered with the Responses API without a namespace, while deferred MCP tools have a namespace. This means we can receive MCP `FunctionCall`s in both formats. Fix this by always registering MCP tools with a namespace, regardless of deferral status.

Make code mode track the `ToolName` provenance of tools so it can map the literal JS function name string to the correct `ToolName` for invocation, rather than supporting both in core.

This lets us unify on a single canonical `ToolName` representation for each MCP tool and force every caller to use that one, without supporting fallbacks.

## Summary - Remove the exec-server-side manual filesystem request path preflight before invoking the sandbox helper. - Keep sandbox helper policy construction and platform sandbox enforcement as the access boundary. - Add a portable local+remote regression for writing through an explicitly configured alias root. - Remove the metadata symlink-escape assertion that depended on the deleted manual preflight; no replacement metadata-specific access probe is added. ## Tests - `cargo test -p codex-exec-server --lib` - `cargo test -p codex-exec-server --test file_system` - `git diff --check`

## Summary Stack PR 2 of 4 for feature-gated agent identity support. This PR adds agent identity registration behind `features.use_agent_identity`. It keeps the app-server protocol unchanged and starts registration after ChatGPT auth exists rather than requiring a client restart. ## Stack - PR1: #17385 - add `features.use_agent_identity` - PR2: #17386 - this PR - PR3: #17387 - register agent tasks when enabled - PR4: #17388 - use `AgentAssertion` downstream when enabled ## Validation Covered as part of the local stack validation pass: - `just fmt` - `cargo test -p codex-core --lib agent_identity` - `cargo test -p codex-core --lib agent_assertion` - `cargo test -p codex-core --lib websocket_agent_task` - `cargo test -p codex-api api_bridge` - `cargo build -p codex-cli --bin codex` ## Notes The full local app-server E2E path is still being debugged after PR creation. The current branch stack is directionally ready for review while that follow-up continues.

## Summary - Skip directory entries whose metadata lookup fails during `fs/readDirectory` - Add an exec-server regression test covering a broken symlink beside valid entries ## Testing - `just fmt` - `cargo test -p codex-exec-server` (started, but dependency/network updates stalled before completion in this environment)

## Summary Setting this up ## Testing - [x] Unit tests pass

…ovider (#17965) ## Why While reviewing #17958, the helper name `is_azure_responses_wire_base_url` looked misleading because the helper returns true for either the `azure` provider name or an Azure Responses `base_url`. The new name makes both inputs part of the contract. ## What - Rename `is_azure_responses_wire_base_url` to `is_azure_responses_provider`. - Move the `openai.azure.` marker into `matches_azure_responses_base_url` so all base URL marker matching is centralized. - Keep `Provider::is_azure_responses_endpoint()` behavior unchanged. ## Verification - Compared the parent and current implementations. `name.eq_ignore_ascii_case("azure")` still returns true before consulting `base_url`, `None` still returns false, base URLs are still lowercased before marker matching, and the same Azure marker set is checked. - Ran `cargo test -p codex-api`.
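The behavior listed under Verification can be sketched as follows. The function signatures are assumptions, and `AZURE_MARKERS` contains only the `openai.azure.` marker named above; the rest of the marker set is not given in this description.

```rust
/// Base-URL markers treated as Azure Responses endpoints.
/// Assumption: only `openai.azure.` is named in the description, so the
/// remaining markers of the real set are deliberately left out.
const AZURE_MARKERS: &[&str] = &["openai.azure."];

/// All base-URL marker matching lives in one place.
fn matches_azure_responses_base_url(base_url: &str) -> bool {
    // Lowercase before matching so mixed-case hosts still match.
    let lowered = base_url.to_ascii_lowercase();
    AZURE_MARKERS.iter().any(|marker| lowered.contains(marker))
}

/// True for the `azure` provider name or for an Azure Responses base URL.
fn is_azure_responses_provider(name: &str, base_url: Option<&str>) -> bool {
    // The provider name is checked first, before consulting `base_url`.
    if name.eq_ignore_ascii_case("azure") {
        return true;
    }
    // A missing base URL never matches.
    base_url.map_or(false, matches_azure_responses_base_url)
}
```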

## Summary - Ensure direct namespaced MCP tool groups are emitted with a non-empty namespace description even when namespace metadata is missing or blank. - Add regression coverage for missing MCP namespace descriptions. ## Cause Latest `main` can serialize a direct namespaced MCP tool group with an empty top-level `description`. The namespace description path used `unwrap_or_default()` when `tool_namespaces` did not include metadata for that namespace, so the outbound Responses API payload could contain a tool like `{"type":"namespace","description":""}`. The Responses API rejects that because namespace tool descriptions must be a non-empty string. ## Fix - Add a fallback namespace description: `Tools in the <namespace> namespace.` - Preserve provided namespace descriptions after trimming, but treat blank descriptions as missing. ### Issue I am seeing This is what I am seeing on the local build. <img width="1593" height="488" alt="Screenshot 2026-04-15 at 10 55 55 AM" src="https://github.com/user-attachments/assets/bab668ba-bf17-4c71-be4e-b102202fce57" /> --------- Co-authored-by: Sayan Sisodiya <sayan@openai.com>
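A minimal sketch of the fallback described under Fix. `namespace_description` is a hypothetical helper name; only the fallback text and the trim-then-treat-blank-as-missing rule come from the description.

```rust
/// Returns a non-empty description for a namespaced MCP tool group.
/// Provided descriptions are trimmed; blank or missing ones fall back
/// to a generated sentence.
fn namespace_description(namespace: &str, provided: Option<&str>) -> String {
    match provided.map(str::trim) {
        Some(text) if !text.is_empty() => text.to_string(),
        _ => format!("Tools in the {namespace} namespace."),
    }
}
```

Compared with `unwrap_or_default()`, this never yields an empty string, which is what the Responses API rejects.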

Builds on top of #17659.

Move the filesystem + sqlite thread-listing operations inside a local ThreadStore implementation, and call ThreadStore from the places that used to perform these filesystem/sqlite operations. This is the first of a series of PRs that will implement the rest of the local ThreadStore.

Testing:
- added unit tests for the thread store implementation
- adjusted some unit tests in the realtime + personality packages whose callsites changed. Specifically, I'm trying to hide ThreadMetadata inside the local implementation and make it a sqlite implementation detail rather than a public interface, preferring the more general StoredThread interface instead
- added a corner case test for the personality migration package that wasn't covered by the existing test suite
- adjusted the behavior of searched thread listing to run the existing local rollout repair/backfill pass _before_ querying SQLite results, so callers using ThreadStore::list_threads do not miss matches after a partial metadata warm-up

## Summary - Keep the existing local-build test announcement as the first announcement entry - Add the CLI update reminder for versions below `0.120.0` - Remove expired onboarding and gpt-5.3-codex announcement entries <img width="1576" height="276" alt="Screenshot 2026-04-15 at 1 32 53 PM" src="https://github.com/user-attachments/assets/10b55d0b-09cd-4de0-ab51-4293d811b80c" />

## Summary - Move auth header construction into the `AuthProvider::add_auth_headers` contract. - Inline `CoreAuthProvider` header mutation in its provider impl and remove the shared header-map helper. - Update HTTP, websocket, file upload, sideband websocket, and test auth callsites to use the provider method. - Add direct coverage for `CoreAuthProvider` auth header mutation. ## Testing - `just fmt` - `cargo test -p codex-api` - `cargo test -p codex-core client::tests::auth_request_telemetry_context_tracks_attached_auth_and_retry_phase` - `cargo test -p codex-core` failed on unrelated/reproducible `tools::handlers::multi_agents::tests::multi_agent_v2_followup_task_interrupts_busy_child_without_losing_message` --------- Co-authored-by: Celia Chen <celia@openai.com>

## Summary - Track outbound remote-control sequence IDs independently for each client stream. - Retain unacked outbound messages per stream using FIFO buffers. - Require stream-scoped acks and update tests for contiguous per-stream sequencing. ## Why The remote-control peer uses outbound sequence gaps to detect lost messages and re-initialize. A single global outbound sequence counter can create apparent gaps on an individual stream when another stream receives an interleaved message. ## Validation - `just fmt` - `cargo test -p codex-app-server remote_control` - `just fix -p codex-app-server` - `git diff --check`
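The per-stream bookkeeping can be sketched as below. This is an illustration under stated assumptions rather than the app-server's implementation: the type and method names are hypothetical, sequence IDs start at 0, and an ack is treated as cumulative (it releases every retained message up to and including the acked ID).

```rust
use std::collections::{HashMap, VecDeque};

/// Outbound state for one client stream.
#[derive(Default)]
struct StreamOutbox {
    next_seq: u64,
    /// Unacked messages in send order (FIFO).
    unacked: VecDeque<(u64, String)>,
}

/// Outbound state keyed by stream, so interleaved sends on other streams
/// never create a sequence gap on this one.
#[derive(Default)]
struct Outbound {
    streams: HashMap<String, StreamOutbox>,
}

impl Outbound {
    /// Assigns the next sequence ID for `stream_id` and retains the message.
    fn send(&mut self, stream_id: &str, payload: &str) -> u64 {
        let stream = self.streams.entry(stream_id.to_string()).or_default();
        let seq = stream.next_seq;
        stream.next_seq += 1;
        stream.unacked.push_back((seq, payload.to_string()));
        seq
    }

    /// Releases retained messages on `stream_id` with ID <= `acked_seq`.
    /// Returns how many messages were released.
    fn ack(&mut self, stream_id: &str, acked_seq: u64) -> usize {
        let Some(stream) = self.streams.get_mut(stream_id) else {
            return 0;
        };
        let mut released = 0;
        while stream.unacked.front().map_or(false, |(seq, _)| *seq <= acked_seq) {
            stream.unacked.pop_front();
            released += 1;
        }
        released
    }

    /// Number of messages still awaiting an ack on `stream_id`.
    fn unacked_len(&self, stream_id: &str) -> usize {
        self.streams.get(stream_id).map_or(0, |s| s.unacked.len())
    }
}
```

With a single global counter, the second send to stream `a` after an interleaved send to stream `b` would skip an ID from `a`'s point of view; here each stream sees contiguous IDs.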

## Why #17763 moved sandbox-state delivery for MCP tool calls to request `_meta` via the `codex/sandbox-state-meta` experimental capability. Keeping the older `codex/sandbox-state` capability meant Codex still maintained a second transport that pushed updates with the custom `codex/sandbox-state/update` request at server startup and when the session sandbox policy changed. That duplicate MCP path is redundant with the per-tool-call metadata path and makes the sandbox-state contract larger than needed. The existing managed network proxy refresh on sandbox-policy changes is still needed, so this keeps that behavior separate from the removed MCP notification. ## What Changed - Removed the exported `MCP_SANDBOX_STATE_CAPABILITY` and `MCP_SANDBOX_STATE_METHOD` constants. - Removed detection of `codex/sandbox-state` during MCP initialization and stopped sending `codex/sandbox-state/update` at server startup. - Removed the `McpConnectionManager::notify_sandbox_state_change` plumbing while preserving the managed network proxy refresh when a user turn changes sandbox policy. - Slimmed `McpConnectionManager::new` so startup paths pass only the initial `SandboxPolicy` needed for MCP elicitation state. - Kept `codex/sandbox-state-meta` support intact; servers that opt in still receive the current `SandboxState` on tool-call request `_meta` ([remaining call path](https://github.com/openai/codex/blob/ff2d3c1e72ff08ce13743b99605d19d338edd51c/codex-rs/core/src/mcp_tool_call.rs#L487-L526)). - Added regression coverage for refreshing the live managed network proxy on a per-turn sandbox-policy change. ## Verification - `cargo test -p codex-core new_turn_refreshes_managed_network_proxy_for_sandbox_change` - `cargo test -p codex-mcp`

Migrate the conversation summary app-server methods to ThreadStore.

Because this app-server API allows explicitly fetching the thread by rollout path, intercept that case in the app-server code and (a) route directly to the underlying local thread store methods if we're using a local thread store, or (b) return an unsupported error if we're using a remote thread store. This keeps the thread store API clean and all filesystem operations inside the local thread store, while pushing the "fundamental incompatibility" check as early as possible.

## Summary Adds a second realtime v2 function tool, `remain_silent`, so the realtime model has an explicit non-speaking action when the collaboration mode or latest context says it should not answer aloud. This is stacked on #18597. ## Design - Advertise `remain_silent` alongside `background_agent` in realtime v2 conversational sessions. - Parse `remain_silent` function calls into a typed `RealtimeEvent::NoopRequested` event. - Have core answer that function call with an empty `function_call_output` and deliberately avoid `response.create`, so no follow-up realtime response is requested. - Keep the event hidden from app-server/TUI surfaces; it is operational plumbing, not user-visible conversation content.

## Summary - add a codex-uds crate with async UnixListener and UnixStream wrappers - expose helpers for private socket directory setup and stale socket path checks - migrate codex-stdio-to-uds onto codex-uds and Tokio-based stdio/socket relaying - update the CLI stdio-to-uds command path for the async runner ## Tests - cargo test -p codex-uds -p codex-stdio-to-uds - cargo test -p codex-cli - just fmt - just fix -p codex-uds - just fix -p codex-stdio-to-uds - just fix -p codex-cli - just bazel-lock-check - git diff --check

Adds a skill that centralizes rules used during code review for codex.

## Why

Cloud-hosted sessions need a way for the service that starts or manages a thread to provide session-owned config without treating all config as if it came from the same user/project/workspace TOML stack. The important boundary is ownership: some values should be controlled by the session/orchestrator, some by the authenticated user, and later some may come from the executor. The earlier broad config-store shape made that boundary too fuzzy and overlapped heavily with the existing filesystem-backed config loader. This PR starts with the smaller piece we need now: a typed session config loader that can feed the existing config layer stack while preserving the normal precedence and merge behavior.

## What Changed

- Added `ThreadConfigLoader` and related typed payloads in `codex-config`.
- `SessionThreadConfig` currently supports `model_provider`, `model_providers`, and feature flags.
- `UserThreadConfig` is present as an ownership boundary, but does not yet add TOML-backed fields.
- `NoopThreadConfigLoader` preserves existing behavior when no external loader is configured.
- `StaticThreadConfigLoader` supports tests and simple callers.
- Taught thread config sources to produce ordinary `ConfigLayerEntry` values so the existing `ConfigLayerStack` remains the place where precedence and merging happen.
- Wired the loader through `ConfigBuilder`, the config loader, and app-server startup paths so app-server can provide session-owned config before deriving a thread config.
- Added coverage for:
  - translating typed thread config into config layers,
  - inserting thread config layers into the stack at the right precedence,
  - applying session-provided model provider and feature settings when app-server derives config from thread params.

## Follow-Ups

This intentionally stops short of adding the remote/service transport. The next pieces are expected to be:

1. Define the proto/API shape for this interface.
2. Add a client implementation that can source session config from the service side.

## Verification

- Added unit coverage in `codex-config` for the loader and layer conversion.
- Added `codex-core` config loader coverage for thread config layer precedence.
- Added app-server coverage that verifies session thread config wins over request-provided config for model provider and feature settings.

## Why The TUI app module had grown past the 512K source-file cap enforced by CI/CD. This keeps the app entry point below that limit while preserving the existing runtime behavior and test surface. ## What changed - Kept the top-level `App` state and run-loop wiring in `tui/src/app.rs`. - Split app responsibilities into focused private submodules under `tui/src/app/`, covering event dispatch, thread routing, session lifecycle, config persistence, background requests, startup prompts, input, history UI, platform actions, and thread event buffering. - Moved the existing app-level tests into `tui/src/app/tests.rs` and reused the existing snapshot location rather than adding new tests or snapshots. - Added module header comments for `app.rs` and the new submodules. ## Follow-up A future cleanup can move narrow unit tests from `tui/src/app/tests.rs` into the specific app submodules they exercise. This PR keeps the existing app-level tests together so the refactor stays focused on the source-file split. ## Verification - `cargo test -p codex-tui --lib app::tests::agent_picker_item_name_snapshot` - `cargo test -p codex-tui --lib app::tests::clear_ui` - `cargo test -p codex-tui --lib app::tests::ctrl_l_clear_ui_after_long_transcript_reuses_clear_header_snapshot` - `just fix -p codex-tui` Full `cargo test -p codex-tui` still fails on model-catalog drift unrelated to this refactor, including stale `gpt-5.3-codex`/`gpt-5.1-codex` snapshot and migration expectations now resolving to `gpt-5.4`.

Addresses #18113 Problem: Shared flags provided before the exec subcommand were parsed by the root CLI but not inherited by the exec CLI, so exec sessions could run with stale or default sandbox and model configuration. Solution: Move shared TUI and exec flags into a common option block and merge root selections into exec before dispatch, while preserving exec's global subcommand flag behavior.

## Summary

- Reject new exec-server client operations once the transport has disconnected.
- Convert pending RPC calls into closed errors instead of synthetic server errors.
- Cover pending read and later write behavior after remote executor disconnect.

## Verification

- `just fmt`
- `cargo check -p codex-exec-server`

## Stack

```text
@  #18027 [6/6] Fail exec client operations after disconnect
│
o  #18212 [5/6] Wire executor-backed MCP stdio
│
o  #18087 [4/6] Abstract MCP stdio server launching
│
o  #18020 [3/6] Add pushed exec process events
│
o  #18086 [2/6] Support piped stdin in exec process API
│
o  #18085 [1/6] Add MCP server environment config
│
o  main
```

---------

Co-authored-by: Codex <noreply@openai.com>

## Summary This fixes a stale-environment path in shell snapshot restoration. A sandboxed command can source a shell snapshot that was captured while an older proxy process was running. If that proxy has died and come back on a different port, the snapshot can otherwise put old proxy values back into the command environment, which is how tools like `pip` end up talking to a dead proxy. The wrapper now captures the live process environment before sourcing the snapshot and then restores or clears every proxy env var from the proxy crate's canonical list. That makes proxy state after shell snapshot restoration match the current command environment, rather than whatever proxy values happened to be present in the snapshot. On macOS, the Codex-generated `GIT_SSH_COMMAND` is refreshed when the SOCKS listener changes, while custom SSH wrappers are still left alone. --------- Co-authored-by: Codex <noreply@openai.com>
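The capture-then-restore step can be sketched as follows, modelling environments as plain maps. The names are hypothetical, and `PROXY_ENV_VARS` is an illustrative subset; the actual wrapper uses the proxy crate's canonical list, which is not reproduced in this description.

```rust
use std::collections::HashMap;

/// Illustrative subset of proxy variables (an assumption, not the canonical list).
const PROXY_ENV_VARS: &[&str] = &["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY"];

type Env = HashMap<String, String>;

/// Records the live value (or absence) of every proxy variable
/// before the snapshot is sourced.
fn capture_proxy_env(live: &Env) -> HashMap<&'static str, Option<String>> {
    PROXY_ENV_VARS
        .iter()
        .map(|name| (*name, live.get(*name).cloned()))
        .collect()
}

/// After the snapshot is sourced, restores each captured value and clears
/// any proxy variable that was unset in the live environment.
fn restore_proxy_env(env: &mut Env, captured: &HashMap<&'static str, Option<String>>) {
    for (name, value) in captured {
        match value {
            Some(v) => {
                env.insert((*name).to_string(), v.clone());
            }
            None => {
                env.remove(*name);
            }
        }
    }
}
```

Clearing variables that were absent matters as much as restoring the ones that were set: a snapshot can otherwise reintroduce a proxy variable that the current command environment does not have at all.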

Addresses #18505 ## Summary When Codex is launched from a subdirectory of a Git repository, the onboarding trust prompt says it is trusting the current directory even though the persisted trust target is the repository root. That can make the scope of the trust decision unclear. This updates the TUI trust prompt to show a yellow note only when the current directory differs from the resolved trust target, explaining that trust applies to the repository root and displaying that root. It also removes the stale onboarding TODO now that the warning is implemented.

## Summary - Track how many realtime transcript entries have already been attached to a background-agent handoff. - Attach only entries added since the previous handoff as `<transcript_delta>` instead of resending the accumulated transcript snapshot. - Update the realtime integration test so the second delegation carries only the second transcript delta. ## Validation - `just fmt` - `cargo test -p codex-api` - `cargo test -p codex-core inbound_handoff_request_sends_transcript_delta_after_each_handoff` - `cargo build -p codex-cli -p codex-app-server` ## Manual testing Built local debug binaries at: - `codex-rs/target/debug/codex` - `codex-rs/target/debug/codex-app-server`
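The delta tracking amounts to remembering how many entries have already been handed off. A minimal sketch, with hypothetical type and method names and plain strings standing in for transcript entries:

```rust
/// Accumulated realtime transcript plus a cursor marking what was
/// already attached to a background-agent handoff.
#[derive(Default)]
struct Transcript {
    entries: Vec<String>,
    /// Number of entries already attached to a previous handoff.
    handed_off: usize,
}

impl Transcript {
    fn push(&mut self, entry: &str) {
        self.entries.push(entry.to_string());
    }

    /// Returns only the entries added since the previous handoff
    /// and advances the cursor past them.
    fn take_delta(&mut self) -> Vec<String> {
        let delta = self.entries[self.handed_off..].to_vec();
        self.handed_off = self.entries.len();
        delta
    }
}
```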

This PR makes the `/statusline` and `/title` setup UIs share one preview-value source instead of each surface using its own examples. Both pickers now render consistent live values when available, and stable placeholders when they are not. It also resolves live preview values at the shared preview-item layer, so `/title` preview can use real runtime values for title-specific cases like status text, task progress, and project-name fallback behavior. - Adds a shared preview data model for status surfaces - Maps status-line items and terminal-title items onto that shared preview list - Feeds both setup views from the same chatwidget-derived preview data, with terminal-title-specific formatting applied before `/title` preview renders - Keeps project-root preview aligned with status-line behavior while project in /title keeps its title fallback/truncation behavior - Adds snapshot coverage for live-only, hardcoded-only, and mixed cases Test Steps - Open Codex TUI and launch `/statusline`. - Toggle and reorder items, then verify the preview uses current session values when possible, and placeholder values for missing values (ex: no thread ID). - Open `/title` and verify it shows the same normalized values, including live status/task-progress values when available.

## Why

Codex needs a first-class `amazon-bedrock` model provider so users can select Bedrock without copying a full provider definition into `config.toml`. The provider has Codex-owned defaults for the pieces that should stay consistent across users: the display `name`, Bedrock `base_url`, and `wire_api`.

At the same time, users still need a way to choose the AWS credential profile used by their local environment. This change makes `amazon-bedrock` a partially modifiable built-in provider: code owns the provider identity and endpoint defaults, while user config can set `model_providers.amazon-bedrock.aws.profile`.

For example:

```toml
model_provider = "amazon-bedrock"

[model_providers.amazon-bedrock.aws]
profile = "codex-bedrock"
```

## What Changed

- Added `amazon-bedrock` to the built-in model provider map with:
  - `name = "Amazon Bedrock"`
  - `base_url = "https://bedrock-mantle.us-east-1.api.aws/v1"`
  - `wire_api = "responses"`
- Added AWS provider auth config with a profile-only shape: `model_providers.<id>.aws.profile`.
- Kept AWS auth config restricted to `amazon-bedrock`; custom providers that set `aws` are rejected.
- Allowed `model_providers.amazon-bedrock` through reserved-provider validation so it can act as a partial override.
- During config loading, only `aws.profile` is copied from the user-provided `amazon-bedrock` entry onto the built-in provider. Other Bedrock provider fields remain hard-coded by the built-in definition.
- Updated the generated config schema for the new provider AWS profile config.
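The partial-override rule can be sketched as below. The struct shapes and function names are hypothetical simplifications, and the sketch assumes the built-in entry carries no AWS config of its own; the three default values are the ones listed above.

```rust
#[derive(Clone, Debug, Default, PartialEq)]
struct AwsAuth {
    profile: Option<String>,
}

#[derive(Clone, Debug, PartialEq)]
struct Provider {
    name: String,
    base_url: String,
    wire_api: String,
    aws: Option<AwsAuth>,
}

/// Code-owned defaults for the built-in provider.
fn built_in_bedrock() -> Provider {
    Provider {
        name: "Amazon Bedrock".to_string(),
        base_url: "https://bedrock-mantle.us-east-1.api.aws/v1".to_string(),
        wire_api: "responses".to_string(),
        aws: None, // assumption: no profile unless the user sets one
    }
}

/// Copies only `aws.profile` from the user's `amazon-bedrock` entry;
/// every other user-supplied field is ignored.
fn apply_user_override(user: &Provider) -> Provider {
    let mut provider = built_in_bedrock();
    if let Some(profile) = user.aws.as_ref().and_then(|aws| aws.profile.clone()) {
        provider.aws = Some(AwsAuth { profile: Some(profile) });
    }
    provider
}
```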

### Why Remote streamable HTTP MCP needs a transport-shaped executor primitive before the MCP client can move network I/O to the executor. This layer keeps the executor unaware of MCP and gives later PRs an ordered streaming surface for response bodies. ### What - Add typed `http/request` and `http/request/bodyDelta` protocol payloads. - Add executor client helpers for buffered and streamed HTTP responses. - Route body-delta notifications to request-scoped streams with sequence validation and cleanup when a stream finishes or is dropped. - Document the new protocol constants, transport structs, public client methods, body-stream lifecycle, and request-scoped routing helpers. - Add in-memory JSON-RPC client coverage for streamed HTTP response-body notifications, with comments spelling out what the test proves and each setup/exercise/assert phase. ### Stack 1. #18581 protocol 2. #18582 runner 3. #18583 RMCP client 4. #18584 manager wiring and local/remote coverage ### Verification - `just fmt` - `cargo check -p codex-exec-server -p codex-rmcp-client --tests` - `cargo check -p codex-core --test all` compile-only - `git diff --check` - Online full CI is running from the `full-ci` branch, including the remote Rust test job. Co-authored-by: Codex <noreply@openai.com> --------- Co-authored-by: Codex <noreply@openai.com>
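The sequence validation on a request-scoped body stream can be sketched as follows. The field names (`seq`, `done`), the error type, and the choice to start sequences at 0 are assumptions; the description does not spell out the `http/request/bodyDelta` payload shape.

```rust
#[derive(Debug, PartialEq)]
enum DeltaError {
    /// A delta arrived with an unexpected sequence number.
    OutOfOrder { expected: u64, got: u64 },
    /// A delta arrived after the stream was marked finished.
    Finished,
}

/// Receiving side of one streamed HTTP response body.
#[derive(Default)]
struct BodyStream {
    next_seq: u64,
    body: Vec<u8>,
    done: bool,
}

impl BodyStream {
    /// Appends `chunk` if `seq` is the next expected sequence number.
    fn accept(&mut self, seq: u64, chunk: &[u8], done: bool) -> Result<(), DeltaError> {
        if self.done {
            return Err(DeltaError::Finished);
        }
        if seq != self.next_seq {
            return Err(DeltaError::OutOfOrder { expected: self.next_seq, got: seq });
        }
        self.next_seq += 1;
        self.body.extend_from_slice(chunk);
        self.done = done;
        Ok(())
    }
}
```

In this sketch a rejected delta leaves the stream usable; whether a gap should instead tear the stream down is a policy decision the description leaves open.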

## Why This is part of the follow-up work from #18178 to make Codex ready for Clippy's [`await_holding_lock`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_lock) / [`await_holding_invalid_type`](https://rust-lang.github.io/rust-clippy/master/index.html#await_holding_invalid_type) lints. This bottom PR keeps the scope intentionally small: `NetworkProxyState::record_blocked()` only needs the state write lock while it mutates the blocked-request ring buffer and counters. The debug log payload and `BlockedRequestObserver` callback can be produced after that lock is released. ## What changed - Copies the blocked-request snapshot values needed for logging while updating the state. - Releases the `RwLockWriteGuard` before logging or notifying the observer. ## Verification - `cargo test -p codex-network-proxy` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18418). * #18698 * #18423 * __->__ #18418
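The lock-scope change follows a common pattern: mutate under the guard, copy out what the log line and observer need, and drop the guard before doing anything else. A minimal synchronous sketch using `std::sync::RwLock` (the real code is async and the types here are hypothetical stand-ins):

```rust
use std::collections::VecDeque;
use std::sync::RwLock;

struct BlockedState {
    /// Ring buffer of recently blocked hosts.
    recent: VecDeque<String>,
    total: u64,
}

struct ProxyStateSketch {
    state: RwLock<BlockedState>,
    capacity: usize,
}

impl ProxyStateSketch {
    fn new(capacity: usize) -> Self {
        Self {
            state: RwLock::new(BlockedState { recent: VecDeque::new(), total: 0 }),
            capacity,
        }
    }

    /// Updates the ring buffer and counter under the write lock, then
    /// notifies the observer only after the guard has been dropped.
    fn record_blocked(&self, host: &str, notify: impl FnOnce(&str, u64)) {
        let total = {
            let mut guard = self.state.write().expect("lock poisoned");
            if guard.recent.len() >= self.capacity {
                guard.recent.pop_front();
            }
            guard.recent.push_back(host.to_string());
            guard.total += 1;
            guard.total // copy the snapshot value out of the guarded state
        }; // write guard is released here
        notify(host, total);
    }
}
```

Because the callback runs outside the critical section, an observer that reads the state back (or awaits, in the async version) cannot deadlock against the writer.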

Update core-plugins MCP loading to accept either an mcpServers object or a top-level server map in .mcp.json

## Why #18274 made `PermissionProfile` the canonical file-system permissions shape, but the round-trip from `FileSystemSandboxPolicy` to `PermissionProfile` still dropped one piece of policy metadata: `glob_scan_max_depth`. That field is security-relevant for deny-read globs such as `**/*.env`. On Linux, bubblewrap sandbox construction uses it to bound unreadable glob expansion. If a profile copied from active runtime permissions loses this value and is submitted back as an override, the resulting `FileSystemSandboxPolicy` can behave differently even though the visible permission entries look equivalent. ## What changed - Add `glob_scan_max_depth` to protocol `FileSystemPermissions` and preserve it when converting to/from `FileSystemSandboxPolicy`. - Keep legacy `read`/`write` JSON for simple path-only permissions, but force canonical JSON when glob scan depth is present so the metadata is not silently dropped. - Carry `globScanMaxDepth` through app-server `AdditionalFileSystemPermissions`, generated JSON/TypeScript schemas, and app-server/TUI conversion call sites. - Preserve the metadata through sandboxing permission normalization, merging, and intersection. - Carry the merged scan depth into the effective `FileSystemSandboxPolicy` used for command execution, so bounded deny-read globs reach Linux bubblewrap materialization. ## Verification - `cargo test -p codex-sandboxing glob_scan -- --nocapture` - `cargo test -p codex-sandboxing policy_transforms -- --nocapture` - `just fix -p codex-sandboxing` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18713). * #18288 * #18287 * #18286 * #18285 * #18284 * #18283 * #18282 * #18281 * #18280 * #18279 * #18278 * #18277 * #18276 * #18275 * __->__ #18713

## Summary This shouldn't error for now ## Test plan - [x] Updated unit test

## Summary

Make the thread ID optional so that we can better cache MCP resources for connectors, since their resource templates are universal and not specific to projects.

- Make `mcpServer/resource/read` accept an optional `threadId`
- Read resources from the current MCP config when no thread is supplied
- Keep the existing thread-scoped path when `threadId` is present
- Update the generated schemas, README, and integration coverage

## Testing

- `just write-app-server-schema`
- `just fmt`
- `cargo test -p codex-app-server-protocol`
- `cargo test -p codex-mcp`
- `cargo test -p codex-app-server --test all mcp_resource`
- `just fix -p codex-mcp`
- `just fix -p codex-app-server-protocol`
- `just fix -p codex-app-server`

This updates TUI skill mentions to show a fallback label when a skill does not define a display name, so unnamed skills remain understandable in the picker without changing behavior for skills that already have one. <img width="1028" height="198" alt="Screenshot 2026-04-20 at 6 25 15 PM" src="https://github.com/user-attachments/assets/84077b85-99d0-4db9-b533-37e1887b4506" />

## Summary Start marking app-server schema files as [linguist-generated](https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github), so we can more easily parse reviews

## Summary When auto-review is enabled, it should handle request_permissions tool. We'll need to clean up the UX but I'm planning to do that in a separate pass ## Testing - [x] Ran locally <img width="893" height="396" alt="Screenshot 2026-04-17 at 1 16 13 PM" src="https://github.com/user-attachments/assets/4c045c5f-1138-4c6c-ac6e-2cb6be4514d8" /> --------- Co-authored-by: Codex <noreply@openai.com>

## Why

Customers need finer-grained control over allowed sandbox modes based on the host Codex is running on. For example, they may want stricter sandbox limits on devboxes while keeping a different default elsewhere.

Our current cloud requirements can target user/account groups, but they cannot vary sandbox requirements by host. That makes remote development environments awkward because the same top-level `allowed_sandbox_modes` has to apply everywhere.

## What

Adds a new `remote_sandbox_config` section to `requirements.toml`:

```toml
allowed_sandbox_modes = ["read-only"]

[[remote_sandbox_config]]
hostname_patterns = ["*.org"]
allowed_sandbox_modes = ["read-only", "workspace-write"]

[[remote_sandbox_config]]
hostname_patterns = ["*.sh", "runner-*.ci"]
allowed_sandbox_modes = ["read-only", "danger-full-access"]
```

During requirements resolution, Codex resolves the local host name once, preferring the machine FQDN when available and falling back to the cleaned kernel hostname. This host classification is best effort rather than authenticated device proof.

Each requirements source applies its first matching `remote_sandbox_config` entry before it is merged with other sources. The shared merge helper keeps that `apply_remote_sandbox_config` step paired with requirements merging so new requirements sources do not have to remember the extra call.

That preserves source precedence: a lower-precedence requirements file with a matching `remote_sandbox_config` cannot override a higher-precedence source that already set `allowed_sandbox_modes`.

This also wires the hostname-aware resolution through app-server, CLI/TUI config loading, config API reads, and config layer metadata so they all evaluate remote sandbox requirements consistently.

## Verification

- `cargo test -p codex-config remote_sandbox_config`
- `cargo test -p codex-config host_name`
- `cargo test -p codex-core load_config_layers_applies_matching_remote_sandbox_config`
- `cargo test -p codex-core system_remote_sandbox_config_keeps_cloud_sandbox_modes`
- `cargo test -p codex-config`
- `cargo test -p codex-core` unit tests passed; the `tests/all.rs` integration matrix was intentionally stopped after the relevant focused tests passed
- `just fix -p codex-config`
- `just fix -p codex-core`
- `cargo check -p codex-app-server`
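The resolution order described above (first matching entry per source, then precedence-ordered merging) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `RemoteSandboxConfig` struct, the `glob_match` helper (which supports only `*`), the signature given to `apply_remote_sandbox_config`, and `merge_sources` are all assumptions made for the example, and it assumes a matching entry replaces that source's own top-level `allowed_sandbox_modes`.

```rust
/// Hypothetical, simplified shape of one `[[remote_sandbox_config]]` entry.
struct RemoteSandboxConfig {
    hostname_patterns: Vec<String>,
    allowed_sandbox_modes: Vec<String>,
}

/// Minimal glob matcher that supports only `*` (any run of characters).
/// Assumption: the real matcher may treat case or other wildcards differently.
fn glob_match(pattern: &str, host: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let h: Vec<char> = host.chars().collect();
    let (mut pi, mut hi) = (0, 0);
    // Position of the last `*` seen and the host index it was tried at.
    let mut star: Option<(usize, usize)> = None;
    while hi < h.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, hi));
            pi += 1;
        } else if pi < p.len() && p[pi] == h[hi] {
            pi += 1;
            hi += 1;
        } else if let Some((sp, sh)) = star {
            // Backtrack: let the last `*` absorb one more host character.
            pi = sp + 1;
            hi = sh + 1;
            star = Some((sp, sh + 1));
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Apply the first matching entry to one source; otherwise keep its top-level value.
fn apply_remote_sandbox_config(
    top_level: Option<Vec<String>>,
    entries: &[RemoteSandboxConfig],
    host: &str,
) -> Option<Vec<String>> {
    entries
        .iter()
        .find(|e| e.hostname_patterns.iter().any(|p| glob_match(p, host)))
        .map(|e| e.allowed_sandbox_modes.clone())
        .or(top_level)
}

/// Merge already-resolved sources, listed from highest to lowest precedence:
/// the first source that set a value wins.
fn merge_sources(sources: Vec<Option<Vec<String>>>) -> Option<Vec<String>> {
    sources.into_iter().flatten().next()
}
```

Because each source is resolved before merging, a lower-precedence source whose entry matches the host still loses to a higher-precedence source that set `allowed_sandbox_modes` directly.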

Organize context fragments under `core/context`. Implement same trait on all of them.

## Why

This PR prepares the stack to enable Clippy await-holding lints that were left disabled in #18178. The mechanical lock-scope cleanup is handled separately; this PR is the documentation/configuration layer for the remaining await-across-guard sites.

Without explicit annotations, reviewers and future maintainers cannot tell whether an await-holding warning is a real concurrency smell or an intentional serialization boundary.

## What changed

- Configures `clippy.toml` so `await_holding_invalid_type` also covers `tokio::sync::{MutexGuard,RwLockReadGuard,RwLockWriteGuard}`.
- Adds targeted `#[expect(clippy::await_holding_invalid_type, reason = ...)]` annotations for intentional async guard lifetimes.
- Documents the main categories of intentional cases: active-turn state transitions that must remain atomic, session-owned MCP manager accesses, remote-control websocket serialization, JS REPL kernel/process serialization, OAuth persistence, external bearer token refresh serialization, and tests that intentionally serialize shared global or session-owned state.
- For external bearer token refresh, documents the existing serialization boundary: holding `cached_token` across the provider command prevents concurrent cache misses from starting duplicate refresh commands, and the current behavior is small enough that an explicit expectation is easier to maintain than adding another synchronization primitive.

## Verification

- `cargo clippy -p codex-login --all-targets`
- `cargo clippy -p codex-connectors --all-targets`
- `cargo clippy -p codex-core --all-targets`
- The follow-up PR #18698 enables `await_holding_invalid_type` and `await_holding_lock` as workspace `deny` lints, so any undocumented remaining offender will fail Clippy.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/18423).
* #18698
* __->__ #18423
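To illustrate why holding the `cached_token` guard across the refresh acts as a serialization boundary, here is a synchronous analogue using only the standard library. It is a sketch under stated assumptions: `TokenCache`, `run_provider`, and `concurrent_refresh_count` are hypothetical names, and `std::sync::Mutex` with threads stands in for the `tokio::sync::Mutex` and `.await` that the real lint targets.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical token cache; `refreshes` counts how often the provider ran.
struct TokenCache {
    cached_token: Mutex<Option<String>>,
    refreshes: AtomicUsize,
}

impl TokenCache {
    fn new() -> Self {
        TokenCache {
            cached_token: Mutex::new(None),
            refreshes: AtomicUsize::new(0),
        }
    }

    /// Stand-in for the slow external provider command.
    fn run_provider(&self) -> String {
        self.refreshes.fetch_add(1, Ordering::SeqCst);
        thread::sleep(Duration::from_millis(20));
        "token".to_string()
    }

    fn token(&self) -> String {
        // The guard is deliberately held across the slow refresh: callers that
        // miss the cache at the same time queue here instead of each starting
        // their own refresh.
        let mut guard = self.cached_token.lock().unwrap();
        if let Some(existing) = &*guard {
            return existing.clone();
        }
        let fresh = self.run_provider();
        *guard = Some(fresh.clone());
        fresh
    }
}

/// Spawn `callers` threads that all request a token from an empty cache and
/// report how many refreshes actually ran.
fn concurrent_refresh_count(callers: usize) -> usize {
    let cache = Arc::new(TokenCache::new());
    let handles: Vec<_> = (0..callers)
        .map(|_| {
            let cache = Arc::clone(&cache);
            thread::spawn(move || cache.token())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    cache.refreshes.load(Ordering::SeqCst)
}
```

Releasing the guard before calling the provider would let several callers observe the empty cache and refresh concurrently, which is the duplicate work the documented boundary avoids.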

Follow-up to #18178, where we said the await-holding clippy rule would be enabled separately. Enable `await_holding_lock` and `await_holding_invalid_type` after the preceding commits fixed or explicitly documented the current offenders.

Deferred dynamic tools need to round-trip a namespace so a tool returned by `tool_search` can be called through the same registry key that core uses for dispatch.

This change adds namespace support for dynamic tool specs/calls, persists it through app-server thread state, and routes dynamic tool calls by full `ToolName` while still sending the app the leaf tool name. Deferred dynamic tools must provide a namespace; non-deferred dynamic tools may remain top-level.

It also introduces `LoadableToolSpec` as the shared function-or-namespace Responses shape used by both `tool_search` output and dynamic tool registration, so dynamic tools use the same wrapping logic in both paths.

Validation:

- `cargo test -p codex-tools`
- `cargo test -p codex-core tool_search`

---------

Co-authored-by: Sayan Sisodiya <sayan@openai.com>
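The routing rule above can be pictured with a small registry keyed by the full tool name. This is only a sketch: the field layout of `ToolName`, the `DynamicToolSpec` and `Registry` types, and their methods are assumptions for illustration and are not taken from the codex crates.

```rust
use std::collections::HashMap;

/// Hypothetical full tool name: an optional namespace plus the leaf name.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ToolName {
    namespace: Option<String>,
    name: String,
}

/// Hypothetical dynamic tool spec; `deferred` marks tools surfaced via search.
struct DynamicToolSpec {
    tool: ToolName,
    deferred: bool,
}

#[derive(Default)]
struct Registry {
    tools: HashMap<ToolName, DynamicToolSpec>,
}

impl Registry {
    /// Deferred tools must carry a namespace; non-deferred tools may be top-level.
    fn register(&mut self, spec: DynamicToolSpec) -> Result<(), String> {
        if spec.deferred && spec.tool.namespace.is_none() {
            return Err(format!(
                "deferred tool `{}` must provide a namespace",
                spec.tool.name
            ));
        }
        self.tools.insert(spec.tool.clone(), spec);
        Ok(())
    }

    /// Look the call up by its full name, but hand the app only the leaf name.
    fn dispatch(&self, called: &ToolName) -> Option<&str> {
        self.tools.get(called).map(|spec| spec.tool.name.as_str())
    }
}
```

Keying on the full name means two tools with the same leaf name in different namespaces stay distinct, while the app still receives the leaf name it registered.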

## Summary

This updates the code review orchestrator skill wording so the instruction explicitly requires returning every issue from every subagent.

## Impact

The change is limited to `.codex/skills/code-review/SKILL.md` and clarifies review aggregation behavior for future Codex-driven reviews.

## Validation

No tests were run because this is a markdown-only skill wording change.

## What

- Explicitly show our "bash mode" by changing the color and adding a callout similar to how we do for `Plan mode (shift + tab to cycle)`
- Also replace our `›` composer prefix with a bang `!`

![](https://github.com/user-attachments/assets/f5549c75-3a03-433d-aa57-e4c6d0682c49)

## Why

- It was unclear that we had a Bash mode
- This feels more responsive
- It looks cool!

---------

Co-authored-by: Codex <noreply@openai.com>

Fixes #13638

## Why

VS Code's integrated terminal can run a Linux shell through WSL without exposing `TERM_PROGRAM` to the Linux process, and with crossterm keyboard enhancement flags enabled that environment can turn dead-key composition into malformed key events instead of composed Unicode input.

Codex already handles composed Unicode correctly, so the fix is to avoid enabling the terminal mode that breaks this path for the affected terminal combination.

## What Changed

- Automatically skip crossterm keyboard enhancement flags when Codex detects WSL plus VS Code, including a Windows-side `TERM_PROGRAM` probe through WSL interop.
- Add `CODEX_TUI_DISABLE_KEYBOARD_ENHANCEMENT` so users can force-disable or force-enable the keyboard enhancement policy for diagnosis.

## Verification

- Added unit coverage for env parsing, VS Code detection, and the WSL/VS Code auto-disable policy.
- `cargo check -p codex-tui` passed.
- `./tools/argument-comment-lint/run.py -p codex-tui -- --tests` passed.
- `cargo test -p codex-tui` was attempted locally, but the checkout failed during linking before tests executed because V8 symbols from `codex-code-mode` were unresolved for `arm64`.
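The policy described above reduces to a small decision function: an explicit override wins, otherwise the flags are skipped only for the WSL plus VS Code combination. The sketch below is an assumption-laden illustration, not the code in `codex-tui`: both function names are hypothetical, and the accepted spellings for the `CODEX_TUI_DISABLE_KEYBOARD_ENHANCEMENT` value are guesses.

```rust
/// Parse the raw value of `CODEX_TUI_DISABLE_KEYBOARD_ENHANCEMENT`.
/// Assumption: the accepted spellings here are illustrative only.
/// `Some(true)` means force-disable, `Some(false)` means force-enable,
/// and `None` means no usable override was given.
fn parse_disable_override(raw: Option<&str>) -> Option<bool> {
    match raw?.trim().to_ascii_lowercase().as_str() {
        "1" | "true" | "yes" | "on" => Some(true),
        "0" | "false" | "no" | "off" => Some(false),
        _ => None,
    }
}

/// Decide whether to enable the keyboard enhancement flags.
/// An explicit override always wins; without one, the flags are skipped only
/// when both WSL and VS Code are detected.
fn enable_keyboard_enhancement(
    disable_override: Option<bool>,
    is_wsl: bool,
    is_vscode: bool,
) -> bool {
    match disable_override {
        Some(disable) => !disable,
        None => !(is_wsl && is_vscode),
    }
}
```

Keeping the decision free of environment reads makes it straightforward to unit test; the caller would read the environment variable and perform the WSL and VS Code detection separately.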

pull bot locked and limited conversation to collaborators Mar 12, 2026

pull bot added the ⤵️ pull and merge-conflict (Sync PR has merge conflicts) labels Mar 12, 2026

bolinfest and others added 27 commits April 15, 2026 04:40

fix: cargo deny (#17915)

1324800

feat: add endpoint to delete memories (#17913)

7579d5a

feat: cleaning of memories extension (#17844)

b6244f7

chore: exp flag (#17921)

af9230d

chore: do not disable memories for past rollouts on reset (#17919)

5e544be

nit: stable test (#17924)

544b4e3

feat: memories menu (#17632)

9402347

Add a menu that:

1. Proposes enabling the memories feature if it is not enabled
2. Lets you choose whether to generate memories and whether to use memories

nit: doc (#17941)

ea13527

feat: sanitize rollouts before phase 1 (#17938)

ec13aaa

feat: reset memories button (#17937)

da86ced

<img width="720" height="175" alt="Screenshot 2026-04-15 at 14 35 02" src="https://github.com/user-attachments/assets/041d73ff-8c16-42a9-8e92-c245805084f0" />

chore(features) codex dependencies feat (#17960)

652380d

## Summary

Setting this up

## Testing

- [x] Unit tests pass

wiltzius-openai and others added 30 commits April 20, 2026 22:39

Add Code Review skill (#18746)

513dc28

Adds a skill that centralizes rules used during code review for codex.

feat: Support more plugin MCP file shapes. (#18780)

6e9e2c2

Update core-plugins MCP loading to accept either an mcpServers object or a top-level server map in .mcp.json
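A rough sketch of accepting both file shapes follows. It assumes a wrapped file nests the server map under an `mcpServers` key while a flat file is itself the map; the tiny `Json` enum stands in for a real JSON value type, and `obj`, `server_map`, and `server_names` are hypothetical helpers rather than core-plugins APIs.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a parsed JSON value (only what this sketch needs).
enum Json {
    Object(BTreeMap<String, Json>),
    Str(String),
}

/// Build an object from key/value pairs (convenience for the example).
fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

/// Return the server map from either accepted shape:
/// `{ "mcpServers": { ... } }` or a top-level `{ "<server>": { ... } }` map.
fn server_map(root: &Json) -> Option<&BTreeMap<String, Json>> {
    let Json::Object(top) = root else {
        return None;
    };
    match top.get("mcpServers") {
        Some(Json::Object(servers)) => Some(servers),
        // `mcpServers` exists but is not an object: treat as malformed.
        Some(_) => None,
        // No wrapper key: the top-level object is the server map itself.
        None => Some(top),
    }
}

/// List configured server names, or an empty list for malformed input.
fn server_names(root: &Json) -> Vec<String> {
    server_map(root)
        .map(|servers| servers.keys().cloned().collect())
        .unwrap_or_default()
}
```

Both shapes resolve to the same map, so the rest of the loading code can stay unaware of which one the plugin author used.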

fix(guardian) Dont hard error on feature disable (#18795)

58e7605

## Summary

This shouldn't error for now

## Test plan

- [x] Updated unit test

chore(app-server) linguist-generated (#18807)

543a08d

## Summary

Start marking app-server schema files as [linguist-generated](https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github), so we can more easily parse reviews

Organize context fragments (#18794)

4c2e730

Organize context fragments under `core/context`. Implement same trait on all of them.

chore: enable await-holding clippy lints (#18698)

1dcea72

Follow-up to #18178, where we said the await-holding clippy rule would be enabled separately. Enable `await_holding_lock` and `await_holding_invalid_type` after the preceding commits fixed or explicitly documented the current offenders.

pull[bot] wants to merge 1215 commits into kontext-security:main from openai:main

pull bot commented Mar 12, 2026


20 participants
